feat(safeoutputs): add daily smoke suite and reject unknown safe-output keys #563
…ut keys
* Add tests/safe-outputs/: 26 daily-scheduled agentic-pipeline fixtures covering all 22 production safe outputs plus a janitor, a queue-build target, and a smoke-failure-reporter that uses ado-aw-debug.create-issue (PR #492).
* Add validate_safe_outputs_keys to fail compile on unrecognised safe-outputs keys (previously silently dropped with only a console warning). Includes a related_safe_output_names helper that lists known tools sharing the typo's first hyphen-segment as a hint.
* Rename create_pr.rs -> create_pull_request.rs so the file name matches the YAML key. Fix the long-standing reply_to_pr_comment.rs name mismatch: the MCP tool was registered as reply-to-pr-review-comment while every user-facing doc, README, prompt, and site page said reply-to-pr-comment. Tool name now matches the docs across mcp.rs tool registration, execute.rs dispatch, and unit tests.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
🔍 Rust PR Review
Summary: Looks good (correctness solid, tests comprehensive). A few minor points worth addressing.
Findings
@copilot implement suggestions
…s_keys

Agent-Logs-Url: https://github.com/githubnext/ado-aw/sessions/3358604b-8ec8-4b6e-918b-81f8b68ef58f
Co-authored-by: jamesadevine <4742697+jamesadevine@users.noreply.github.com>
Done in 90467d7. All three suggestions from the automated review are now addressed:
A new unit test …

Warning: Firewall rules blocked me from connecting to one or more addresses. I tried to connect to the following addresses, but was blocked by firewall rules:
If you need me to access, download, or install something from one of these locations, you can either:
/rust-review
✅ Rust PR Reviewer completed successfully!
🔍 Rust PR Review
Summary: Looks good (clean implementation with solid test coverage); one minor log-string inconsistency noted.
Findings
@copilot implement suggestion
Agent-Logs-Url: https://github.com/githubnext/ado-aw/sessions/b3784ea7-b6cf-4671-9d6c-2371a37962e6
Co-authored-by: jamesadevine <4742697+jamesadevine@users.noreply.github.com>
Summary
Adds an end-to-end daily smoke suite for every production safe output and tightens the compiler so silently-dropped safe-outputs: keys can no longer hide.

26 daily smoke fixtures under tests/safe-outputs/

Each fixture is a minimal agentic-pipeline .md that calls exactly one safe-output tool with predictable literal values, plus a committed .lock.yml. The schedule is baked into every fixture; after the operator registers them once in the AgentPlayground ADO sandbox, the suite runs unattended.

* … (daily around 03:00): noop, missing-data, missing-tool, report-incomplete, create-work-item, comment-on-work-item, update-work-item, link-work-items, create-branch, create-git-tag, create-wiki-page, update-wiki-page, add-build-tag, queue-build, create-pull-request, add-pr-comment, reply-to-pr-comment, resolve-pr-thread, submit-pr-review, update-pr, upload-build-attachment, upload-workitem-attachment, upload-pipeline-artifact.
* noop-target.md: trivial target pipeline that the queue-build smoke fires at.
* janitor.md (weekly on monday around 02:00): prunes ado-aw-smoke-* artifacts older than 30 days.
* smoke-failure-reporter.md (daily around 04:30): queries the ADO REST builds endpoint, finds failed smoke runs, and uses ado-aw-debug.create-issue (feat(safeoutputs): add ado-aw-debug.create-issue for dogfood pipelines #492) to file [smoke-failure] ... issues against githubnext/ado-aw. allowed-labels: [], max: 5, title-prefix dedupes against open issues. The ADO_AW_DEBUG_GITHUB_TOKEN PAT must live only on this pipeline.
* README.md documents the suite layout; REGISTERED.md is a contributor-maintained fixture → ADO pipeline ID table plus the manual-handoff checklist (service connections, perma-PR, variable group, token provisioning).

Compiler: fail loud on unknown safe-output keys
generate_enabled_tools_args used to skip unknown keys with only a console warning. Authoring this PR surfaced two real-world consequences:

* … create_pr → create_pull_request (see below) initially used create-pr: in fixtures and compiled cleanly with the tool dropped at runtime.
* The reply-to-pr-comment MCP-name discrepancy (see below) is exactly this class of silent failure.

The new validate_safe_outputs_keys in src/compile/common.rs runs in the validation block before pipeline emission. Unknown keys now bail with a structured error listing all known tools that share the typo's first hyphen-segment, e.g.: …

The memory migration key remains a soft warning (back-compat); DEBUG_ONLY_TOOLS keys keep their more specific error from validate_ado_aw_debug_config. The plain Levenshtein "did you mean X?" heuristic was rejected because it returned misleading suggestions like update-pr for create-pr (both end in -pr). Listing every same-prefix candidate is honest and lets the operator pick.

File ↔ YAML-key alignment
* src/safeoutputs/create_pr.rs → src/safeoutputs/create_pull_request.rs (now matches YAML key create-pull-request). git mv preserves history.
* reply_to_pr_comment.rs declared MCP tool name reply-to-pr-review-comment, while docs/safe-outputs.md, README.md, prompts/*, and site/ all said reply-to-pr-comment. Updated Rust to match docs across tool_result!, mcp.rs registration, execute.rs dispatch, four unit tests, and internal log strings. Technically a behavioural rename of the wire-level MCP name, but no user could have invoked the previously-registered name successfully (it wasn't documented anywhere), so this is a fix rather than a breaking change.

Test plan
* cargo build: clean (warnings unchanged from baseline).
* cargo test: every existing test passes plus 9 new unit tests for validate_safe_outputs_keys and related_safe_output_names: … safe-outputs: accept, create-pull-request appears in the similar-tools hint for create-pr, create-issue not double-flagged (deferred to validate_ado_aw_debug_config), memory migration soft-path, related_safe_output_names("create-pr") includes every create-* and excludes update-pr, related_safe_output_names("fabricated-tool-name") returns empty.
* cargo clippy --all-targets --all-features -- -D warnings: 100+ baseline errors, 0 net-new from changed files (src/compile/common.rs, src/safeoutputs/create_pull_request.rs, src/safeoutputs/reply_to_pr_comment.rs, src/safeoutputs/mod.rs, src/mcp.rs, src/execute.rs).
* tests/bash_lint_tests.rs with ENFORCE_BASH_LINT=1 and shellcheck on PATH: passes (existing fixtures only; the new smoke fixtures are validated separately in the sweep below).
* … ado-aw compile and ado-aw check; the resulting bash bodies (mostly set -euo pipefail + printf setup steps) pass shellcheck.

Follow-ups (out of scope, manual)
* … *.lock.yml in https://dev.azure.com/msazuresphere/AgentPlayground, fill in tests/safe-outputs/REGISTERED.md in a follow-up docs PR.
* … ADO_AW_DEBUG_GITHUB_TOKEN (secret, scoped Issues: Read and write on githubnext/ado-aw only) on the smoke-failure-reporter pipeline.
* … engine.model: gpt-5-mini is available to copilot in the tenant before enabling the smokes; if not, update the model field across the fixture set.
* … setup: placeholders in janitor.md and resolve-pr-thread.md with real az / REST commands.
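
Illustrative sketch (not the code in this PR): the first-hyphen-segment hint described under "Compiler: fail loud on unknown safe-output keys" could look roughly like the following. The function names are borrowed from the description, but the signatures, the KNOWN_SAFE_OUTPUTS list (a hand-picked subset), and the error wording are all assumptions made for illustration. The memory soft-warning path and the DEBUG_ONLY_TOOLS deferral are not modelled.

```rust
// Hypothetical subset of known safe-output tool names; the real registry
// lives in the repository and is not reproduced here.
const KNOWN_SAFE_OUTPUTS: &[&str] = &[
    "noop",
    "create-work-item",
    "create-branch",
    "create-git-tag",
    "create-wiki-page",
    "create-pull-request",
    "update-work-item",
    "update-wiki-page",
    "update-pr",
    "add-pr-comment",
    "reply-to-pr-comment",
];

/// Returns every known tool whose first hyphen-segment equals the first
/// hyphen-segment of `unknown` (for example "create" for "create-pr").
fn related_safe_output_names(unknown: &str) -> Vec<&'static str> {
    let prefix = unknown.split('-').next().unwrap_or("");
    if prefix.is_empty() {
        return Vec::new();
    }
    KNOWN_SAFE_OUTPUTS
        .iter()
        .copied()
        .filter(|name| name.split('-').next() == Some(prefix))
        .collect()
}

/// Fails on the first key that is not a known tool, attaching the
/// same-prefix candidates as a hint when any exist.
fn validate_safe_outputs_keys(keys: &[&str]) -> Result<(), String> {
    for key in keys {
        if KNOWN_SAFE_OUTPUTS.iter().any(|known| known == key) {
            continue;
        }
        let related = related_safe_output_names(key);
        let hint = if related.is_empty() {
            String::new()
        } else {
            format!(" (known tools sharing its prefix: {})", related.join(", "))
        };
        return Err(format!("unknown safe-outputs key '{key}'{hint}"));
    }
    Ok(())
}
```

In this sketch, validate_safe_outputs_keys(&["noop", "create-pr"]) returns an error whose hint lists every create-* entry, including create-pull-request, and omits update-pr. That is the behaviour the description prefers over a Levenshtein-style single suggestion.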